26 research outputs found

    Predicting Proteome-Early Drug Induced Cardiac Toxicity Relationships (Pro-EDICToRs) with Node Overlapping Parameters (NOPs) of a new class of Blood Mass-Spectra graphs

    The 11th International Electronic Conference on Synthetic Organic Chemistry, session Computational Chemistry.
    Blood Serum Proteome-Mass Spectra (SP-MS) may allow the detection of Proteome-Early Drug Induced Cardiac Toxicity Relationships (called here Pro-EDICToRs). However, because the serum proteome (SP) contains thousands of proteins, identifying general Pro-EDICToRs patterns rather than a single protein marker may represent a more realistic alternative. To this end, we first introduced a novel Cartesian 2D spectrum graph for SP-MS. Next, we introduced the graph node-overlapping parameters (nopk) to characterize SP-MS numerically, using them as inputs to seek a Quantitative Proteome-Toxicity Relationship (QPTR) classifier for Pro-EDICToRs with accuracy higher than 80%. Principal Component Analysis (PCA) on the nopk values present in the QPTR model explains 82.7% of the variance with a single factor (F1). These nopk values were then used to construct, for the first time, a Pro-EDICToRs Complex Network whose nodes (samples) are linked by edges (similarity between two samples). We compared the topology of two sub-networks (cardiac toxicity and control samples) and found extreme relative differences for the re-linking (P) and Zagreb (M2) indices (9.5% and 54.2%, respectively) out of 11 parameters. We also compared the sub-networks with well-known ideal random networks, including the Barabasi-Albert, Kleinberg Small World, Erdos-Renyi, and Eppstein Power Law models. Finally, we proposed Partial Order (PO) schemes of the 115 samples based on LDA probabilities, F1 scores and/or network node degrees. PCA-CN and LDA-PCA based POs with Tanimoto's coefficients equal to or higher than 0.75 are promising for the study of Pro-EDICToRs. These results show that simple QPTR models based on MS graph numerical parameters are an interesting tool for proteome research.
    The authors thank projects funded by the Xunta de Galicia (PXIB20304PR and BTF20302PR) and the Ministerio de Sanidad y Consumo (PI061457). González-Díaz H. acknowledges a tenure-track research position funded by the Program Isidro Parga Pondal, Xunta de Galicia.
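
The abstract above compares sub-networks through topological indices such as the Zagreb M2 index. As a minimal sketch, assuming the standard definition of the second Zagreb index (the sum over all edges of the product of the degrees of the two end nodes), the index can be computed from an edge list as follows. The two toy networks and the function name are hypothetical and are not data or code from the study.

```python
from collections import defaultdict


def second_zagreb_index(edges):
    """Second Zagreb index M2: sum over edges (u, v) of deg(u) * deg(v)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(degree[u] * degree[v] for u, v in edges)


# Hypothetical toy sub-networks of samples (illustration only)
path_network = [("s1", "s2"), ("s2", "s3"), ("s3", "s4")]
star_network = [("c", "s1"), ("c", "s2"), ("c", "s3")]

m2_path = second_zagreb_index(path_network)
m2_star = second_zagreb_index(star_network)
print(m2_path, m2_star)
```

Two networks with the same number of edges can therefore differ in M2 when their degree distributions differ, which is the kind of contrast the abstract reports between the toxicity and control sub-networks.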

    Net-Net Auto Machine Learning (AutoML) Prediction of Complex Ecosystems

    Biological Ecosystem Networks (BENs) are webs of biological species (nodes) establishing trophic relationships (links). Experimental confirmation of all possible links is difficult and generates a huge volume of information. Consequently, computational prediction becomes an important goal. Artificial Neural Networks (ANNs) are Machine Learning (ML) algorithms that may be used to predict BENs, using Shannon entropy information measures (Shk) of known ecosystems as training inputs. However, it is difficult to select a priori which ANN topology will have the highest accuracy. Auto Machine Learning (AutoML) methods focus on the automatic selection of the most efficient ML algorithms for specific problems. In this work, a preliminary study of a new approach to the AutoML selection of ANNs is proposed for the prediction of BENs. We call it the Net-Net AutoML approach because it uses, for the first time, Shk values of both kinds of networks involved: BENs (networks to be predicted) and ANN topologies (networks to be tested). Twelve types of classifiers have been tested for the Net-Net model, including linear, Bayesian, tree-based methods, multilayer perceptrons and deep neural networks. The best Net-Net AutoML model for 338,050 outputs of 10 ANN topologies for links of 69 BENs was obtained with a deep fully connected neural network, characterized by a test accuracy of 0.866 and a test AUROC of 0.935. This work paves the way for the application of Net-Net AutoML to other systems or ML algorithms.
    The authors acknowledge the Basque Government (Eusko Jaurlaritza) grant (IT1045-16) - 2016-2021 for consolidated research groups. This work was supported by the "Collaborative Project in Genomic Data Integration (CICLOGEN)" PI17/01826, funded by the Carlos III Health Institute as part of the Spanish National Plan for Scientific and Technical Research and Innovation 2013-2016, and by the European Regional Development Funds (FEDER). This project was also supported by the General Directorate of Culture, Education and University Management of Xunta de Galicia (ED431D 2017/16), the "Drug Discovery Galician Network" (Ref. ED431G/01) and the "Galician Network for Colorectal Cancer Research" (Ref. ED431D 2017/23), and finally by the Spanish Ministry of Economy and Competitiveness through the funding of the unique installation BIOCAI (UNLC08-1E-002, UNLC13-13-3503) and by the European Regional Development Funds (FEDER) of the European Union. CR Munteanu acknowledges the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.
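
The abstract does not state how the Shk values are computed. One plausible reading, offered purely as an assumption, is the Shannon entropy of the node probability distribution after k steps of a Markov chain (random walk) on the network. The sketch below follows that reading; the function name, the uniform starting distribution, and the three-node toy web are all hypothetical.

```python
import math


def markov_shannon_entropies(adjacency, k_max):
    """Return [Sh_0, ..., Sh_kmax] for a random walk on the given network.

    Assumption: Sh_k is the Shannon entropy of the node probabilities
    after k steps, starting from a uniform distribution over nodes.
    """
    n = len(adjacency)
    # Row-normalise the adjacency matrix into transition probabilities.
    # Nodes without outgoing links keep an all-zero row in this sketch.
    transition = []
    for row in adjacency:
        total = sum(row)
        transition.append([x / total if total else 0.0 for x in row])

    p = [1.0 / n] * n
    entropies = []
    for _ in range(k_max + 1):
        entropies.append(-sum(x * math.log(x) for x in p if x > 0))
        # One Markov step: p_next[j] = sum_i p[i] * transition[i][j]
        p = [sum(p[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return entropies


# Hypothetical three-species web with links 0-1 and 1-2 (illustration only)
toy_web = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
sh = markov_shannon_entropies(toy_web, k_max=2)
print(sh)
```

A vector of such values per network would give a fixed-length numerical input regardless of network size, which is one reason entropy-type descriptors suit a setting where both ecosystems and ANN topologies must be described in the same way.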

    Evolutionary Computation and QSAR Research

    The successful high-throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. Virtual molecular filtering and screening rely greatly on quantitative structure-activity relationship (QSAR) analysis, which builds a mathematical model that correlates the activity of a molecule with its molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect or poor pharmacokinetic profile, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: codification of the molecular structure into molecular descriptors, selection of the variables relevant to the analyzed activity, and search for the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid the variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basics of genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR, and the future trend toward joint or multi-task feature selection methods.
    Funding: Instituto de Salud Carlos III (PIO52048, RD07/0067/0005); Ministerio de Industria, Comercio y Turismo (TSI-020110-2009-53); Galicia. Consellería de Economía e Industria (10SIN105004P)
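
To illustrate the variable selection step described above, here is a minimal sketch of a genetic algorithm that evolves binary masks over molecular descriptors. It is not taken from the review: the fitness function is a toy surrogate for model building (squared Pearson correlation between the sum of the selected descriptors and the activity, minus a small penalty per descriptor), and the data are synthetic. All names and parameter values are hypothetical.

```python
import random


def pearson_r2(xs, ys):
    """Squared Pearson correlation; 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)


def fitness(mask, X, y, penalty=0.02):
    """Toy fitness: fit quality of the selected descriptors minus a size penalty."""
    if not any(mask):
        return 0.0
    combined = [sum(v for v, m in zip(row, mask) if m) for row in X]
    return pearson_r2(combined, y) - penalty * sum(mask)


def ga_select(X, y, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Evolve binary descriptor masks; returns the best mask found."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, X, y), reverse=True)
        next_pop = [pop[0][:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents
            a = max(rng.sample(pop, 3), key=lambda m: fitness(m, X, y))
            b = max(rng.sample(pop, 3), key=lambda m: fitness(m, X, y))
            cut = rng.randrange(1, n_feat)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation
            child = [1 - g if rng.random() < p_mut else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=lambda m: fitness(m, X, y))


# Synthetic data: activity depends on descriptors 0 and 3 only (illustration)
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(6)] for _ in range(40)]
y = [row[0] + row[3] for row in X]

best_mask = ga_select(X, y)
print(best_mask, round(fitness(best_mask, X, y), 3))
```

In a real QSAR workflow the fitness would typically be a cross-validated statistic of an actual model trained on the selected descriptors; the surrogate here only keeps the example short and dependency-free.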

    Unify Markov model for Rational Design and Synthesis of More Safe Drugs. Predicting Multiple Drugs Side Effects

    The 9th International Electronic Conference on Synthetic Organic Chemistry, session Computational Chemistry.
    Most current mathematical models for the rational design and synthesis of new drugs consider only the molecular structure. In the present article we aim to extend the use of Markov Chain models to define novel molecular descriptors that also consider other parameters, such as the target site or the biological effect. Specifically, this model takes into consideration not only the molecular structure but also the specific biological system the drug affects. Herein, we develop a general Markov model that describes 19 different drug side effects grouped into 8 affected biological systems for 178 drugs, giving 270 cases in total. The data were processed by Linear Discriminant Analysis (LDA), classifying drugs according to their specific side effects, with forward stepwise as the variable selection strategy. The average percentage of good classification and the number of compounds used in the training/predicting sets were 100/95.8% for endocrine manifestations (18 out of 18)/(13 out of 14); 90.5/92.3% for gastrointestinal manifestations (38 out of 42)/(30 out of 32); 88.5/86.5% for systemic phenomena (23 out of 26)/(17 out of 20); 81.8/77.3% for neurological manifestations (27 out of 33)/(19 out of 25); 81.6/86.2% for dermal manifestations (31 out of 38)/(25 out of 29); 78.4/85.1% for cardiovascular manifestations (29 out of 37)/(24 out of 28); 77.1/75.7% for breathing manifestations (27 out of 35)/(20 out of 26); and 75.6/75% for psychiatric manifestations (31 out of 41)/(23 out of 31). Additionally, a Back-Projection Analysis (BPA) was carried out for two ulcerogenic drugs to demonstrate, in structural terms, the physical interpretation of the models obtained. This article develops, for the first time, a model that encompasses a large number of drug side effects grouped into specific biological systems using stochastic absolute probabilities of interaction (Apk (j)).

    QSAR for Anti-RNA-Virus Activity, Synthesis, and Assay of Anti-RSV Carbonucleosides Given an Unify Representation of Spectral Moments, Quadratic, and Topologic Indices

    The 9th International Electronic Conference on Synthetic Organic Chemistry, session Computational Chemistry.
    The unified representation of spectral moments, classic topologic indices, quadratic indices, and stochastic molecular descriptors shows that all these molecular descriptors lie within the same family. Consequently, the same a priori probability of a successful quantitative structure-activity relationship (QSAR) may be expected no matter which indices are selected. Herein, we used stochastic spectral moments as molecular descriptors to seek a QSAR using a database of 221 bioactive compounds previously tested against diverse RNA viruses and 402 non-active ones. The QSAR model thus obtained correctly classifies 90.9% of the compounds in training. The model also correctly classifies a total of 87.9% of 207 compounds in an additional external prediction series, 73 of them having anti-RNA-virus activity and 134 being non-active. In addition, all compounds were regrouped into five different subsets for leave-group-out studies: 1) anti-influenza, 2) anti-picornavirus, 3) anti-paramyxovirus, 4) anti-RSV/anti-influenza, and 5) broad range anti-RNA-virus activity. The model retained overall accuracies of about 90% in these studies, validating its robustness. Finally, we exemplify the practical use of the model with the discovery of compounds 124 and 128. These compounds presented MIC50 values of 3.2 and 8 µg/mL, respectively, against respiratory syncytial virus (RSV). Both compounds also have low cytotoxicity, expressed by Minimal Cytotoxic Concentrations > 400 µg/mL for HeLa cells. The present approach represents an effort toward the formalization and application of molecular indices in bioinformatics, bioorganic and medicinal chemistry.
    The authors would like to express their gratitude for partial financial support to the Department of Organic Chemistry, University of Santiago de Compostela.

    Markovian Chemicals “in silico” Design (MARCH-INSIDE), a Promising Approach for Computer-Aided Molecular Design III: 2.5D Indices for the Discovery of Antibacterials

    The 9th International Electronic Conference on Synthetic Organic Chemistry, session Computational Chemistry.
    The present work continues our series on the use of MARCH-INSIDE molecular descriptors [parts I and II: J. Mol. Mod. (2002) 8: 237-245 and (2003) 9: 395-407]. These descriptors encode information regarding the distribution of electrons in the molecule, based on a simple stochastic approach to the idea of electronegativity equalization (Sanderson's principle). Here, 3D-MARCH-INSIDE molecular descriptors for 667 organic compounds are used as input for a Linear Discriminant Analysis. This 2.5D-QSAR model discriminates between antibacterial and non-antibacterial compounds with 92.9% accuracy in the training set, and it correctly classifies 94.0% of the compounds in the test set. Additionally, the present QSAR performs similarly to or better than other methods reported elsewhere. Finally, the discovery of a novel compound illustrates the use of the method. This compound, 2-bromo-3-(furan-2-yl)-3-oxo-propionamide, has MIC50 values of 6.25 and 12.50 µg/mL against Ps. aeruginosa ATCC 27853 and E. coli ATCC 27853, respectively, while ampicillin, amoxicillin, clindamycin, and metronidazole have, for instance, MIC50 values higher than 250 µg/mL against E. coli. Consequently, the present method may become a useful tool for the in silico discovery of antibacterials.
    We thank the Spanish Ministry of Science and Technology (SAF2003-02222) for partial financial support. Molina RR, Castañedo C, and Almeida SM acknowledge support from the Universität Rostock, Germany.

    Complex Networks and Machine Learning: From Molecular to Social Sciences

    Combining complex network analysis methods with machine learning (ML) algorithms has become a very useful strategy for the study of complex systems in applied sciences. Notably, the structure and function of such systems can be studied and represented through the above-mentioned approaches. These systems range from small chemical compounds, proteins, metabolic pathways, and other molecular systems to neuronal synapses in the brain's cortex, ecosystems, the internet, markets, social networks, program development in education, social learning, etc. On the other hand, ML algorithms are useful to study large datasets with characteristic features of complex systems. In this context, we decided to launch a special issue focused on the benefits of using ML and complex network analysis (in combination or separately) to study complex systems in applied sciences. The topic of the issue is: Complex Networks and Machine Learning in Applied Sciences. Contributions to this special issue are highlighted below. The present issue is also linked to the conference series MOL2NET International Conference on Multidisciplinary Sciences, ISSN: 2624-5078, MDPI AG, SciForum, Basel, Switzerland. At the same time, the special issue and the conference host the works published by students/tutors of the USEDAT: USA-Europe Data Analysis Training Worldwide Program.

    Toward the computer-aided discovery of FabH inhibitors. Do predictive QSAR models ensure high quality virtual screening performance?

    Antibiotic resistance has increased over the past two decades. New approaches for the discovery of novel antibacterials are required, and innovative strategies will be necessary to identify novel and effective candidates. Related to this problem, it is necessary to explore bacterial targets that remain unexploited by the antibiotics currently in clinical use. One such target is the β-ketoacyl-acyl carrier protein synthase III (FabH). Here, we report a ligand-based modeling methodology for the virtual screening of large collections of chemical compounds in the search for potential FabH inhibitors. QSAR models are developed for a diverse dataset of 296 FabH inhibitors using an in-house modeling framework. All models showed high fitting, robustness, and generalization capabilities. We further investigated the performance of the developed models in a virtual screening scenario. To carry out this investigation, we implemented a desirability-based algorithm for decoy selection that proved effective in selecting high-quality decoy sets. Once the QSAR models were evaluated in the context of a virtual screening experiment, their limitations became apparent. For this reason, we explored the potential of ensemble modeling to overcome the limitations associated with the use of single classifiers. A detailed evaluation of the virtual screening performance of ensemble models showed, for the first time to our knowledge, the benefits of this approach in a virtual screening scenario. From all the obtained results, we arrive at one main conclusion: at least for FabH inhibitors, virtual screening performance is not guaranteed by predictive QSAR models.

    A desirability-based multi objective approach for the virtual screening discovery of broad-spectrum anti-gastric cancer agents

    Gastric cancer is the third leading cause of cancer-related mortality worldwide and, despite advances in prevention, diagnosis and therapy, it is still regarded as a global health concern. The efficacy of the therapies for gastric cancer is limited by a poor response to currently available therapeutic regimens. One of the reasons that may explain these poor clinical outcomes is the highly heterogeneous nature of this disease. In this sense, it is essential to discover new molecular agents capable of targeting various gastric cancer subtypes simultaneously. Here, we present a multi-objective approach for the ligand-based virtual screening discovery of chemical compounds simultaneously active against the gastric cancer cell lines AGS, NCI-N87 and SNU-1. The proposed approach relies on a novel methodology based on the development of ensemble models for the bioactivity prediction against each individual gastric cancer cell line. The methodology includes the aggregation of one ensemble per cell line, using a desirability-based algorithm, into virtual screening protocols. Our research leads to the proposal of a multi-targeted virtual screening protocol able to achieve high enrichment of known chemicals with anti-gastric cancer activity. Specifically, our results indicate that, using the proposed protocol, it is possible to retrieve almost 20 times more multi-targeted compounds in the first 1% of the ranked list than would be expected from a uniform distribution of the active ones in the virtual screening database. More importantly, the proposed protocol attains an outstanding initial enrichment of known multi-targeted anti-gastric cancer agents.
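
The comparison above, actives retrieved in the first 1% of the ranked list versus what a uniform distribution would give, corresponds to what is commonly called an enrichment factor. The following is a minimal sketch of that calculation under the usual definition; the function name and the ranked list are hypothetical and do not reproduce the study's data.

```python
def enrichment_factor(ranked_labels, fraction=0.01):
    """Enrichment factor in the top `fraction` of a ranked screening list.

    ranked_labels: 1 for a known active, 0 for a decoy, best-scored first.
    Returns observed hits in the top slice divided by the hits expected
    if the actives were spread uniformly over the whole list.
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_top = max(1, round(fraction * n_total))
    hits_top = sum(ranked_labels[:n_top])
    expected_hits = n_actives * n_top / n_total
    return hits_top / expected_hits


# Hypothetical ranked list: 1000 compounds, 10 actives, 2 of them in the top 10
ranked = [1, 1] + [0] * 8 + [1] * 8 + [0] * 982
ef_1pct = enrichment_factor(ranked, fraction=0.01)
print(ef_1pct)
```

A value of 1 means the ranking does no better than random ordering in that slice, so a value near 20 at 1% indicates strong early enrichment.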